Pronunciation Variation Speech Recognition without New Dictionary Construction
نویسندگان
چکیده
Generally, a speech recognition system uses a fixed set of pronunciations according to the dictionary for training and decoding. However, even a well-defined dictionary cannot be used to support all variations in human’s pronunciation. Besides, in order to cover all possible pronunciations, the size of the dictionary would be too large to implement. This paper presents efficient strategies for both training and decoding of a continuous speech recognition system: tree of knowledge-based pronunciation variations re-label training and state-level pronunciation variation model, respectively. These strategies can efficiently support the variations in pronunciation according to the rules without necessity to make pronunciation variation dictionary. The pronunciation variation training is modified from the re-label training to obtain the maximum likelihood pronunciation during training in order to reduce the error in an acoustic model. Although the database and rules used in this paper is Thai, this system can also be adapted to other languages easily as the variations are controlled by simple rules. The system shows better performance in the experiment.
منابع مشابه
Pronunciation variation speech recognition without dictionary modification on sparse database
Generally, a speech recognition system uses a fixed set of pronunciations according to the dictionary for training and decoding. However, even a well-defined lexicon cannot be used to support all variations in human’s pronunciation. Besides, in order to cover all possible pronunciations, the size of the dictionary would be too large to implement. Sharing gaussian densities across phonetic model...
متن کاملModeling Pronunciation Variation for Cantonese Speech Recognition
Due to the large variability of pronunciation in spontaneous speech, pronunciation modeling becomes a more challenging and essential part in speech recognition. In this paper, we describe two different approaches of pronunciation modeling by using decision tree. At lexical level, a pronunciation variation dictionary is built to obtain alternative pronunciations for each word, in which each entr...
متن کاملEnhanced tree clustering with single pronunciation dictionary for conversational speech recognition
Modeling pronunciation variation is key for recognizing conversational speech. Rather than being limited to dictionary modeling, we argue that triphone clustering is an integral part of pronunciation modeling. We propose a new approach called enhanced tree clustering. This approach, in contrast to traditional decision tree based state tying, allows parameter sharing across phonemes. We show tha...
متن کاملPronunciation Modeling for Spontaneous Speech by Maximizing Word Correct Rate in a Production- Recognition Model
In this paper, we develop a new method for compiling a pronunciation dictionary to model pronunciation variation in spontaneous speech recognition. The pronunciation dictionary is assembled by iteratively selecting pronunciations from a datadriven word confusion table, based on directly maximizing the word correct rate simulated by a production-recognition model such that the optimal performanc...
متن کاملImproving pronunciation modeling for non-native speech recognition
In this paper, three different approaches to pronunciation modeling are investigated. Two existing pronunciation modeling approaches, namely the pronunciation dictionary and n-best rescoring approach are modified to work with little amount of non-native speech. We also propose a speaker clustering approach, which capable of grouping the speakers based on their pronunciation habits. Given some s...
متن کامل